Discriminative feature extraction for speech recognition in noise

نویسندگان

  • Ángel de la Torre
  • Antonio M. Peinado
  • Antonio J. Rubio
  • Pedro García-Teodoro
چکیده

Signal representation is crucial for designinga speechrecognizer. The feature extractor selects the information to be used by the classifier to perform the recognition. In noisy environments, the data vectors representing the speech signal are changed and the recognizer performance is degraded by two main facts: (1) the mismatch between the training and the recognition conditions and (2) the degradation of the signal to be recognized. In such a situation, the representation of the speech signal plays an important role. In this paper, we analyze the importance of the representation for speechrecognition in noise. We apply the Discriminative Feature Extraction (DFE) method to optimize the representation. The experiments presented in this work show that the DFE method, which has been successfully applied in clean environments, leads also to improvements of the speech recognizers in noise.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Evaluation of Combinational Use of Discriminant Analysis-Based Acoustic Feature Transformation and Discriminative Training

To improve speech recognition performance, acoustic feature transformation based on discriminant analysis has been widely used. For the same purpose, discriminative training of HMMs has also been used. In this letter we investigate the effectiveness of these two techniques and their combination. We also investigate the robustness of matched and mismatched noise conditions between training and e...

متن کامل

روشی جدید در بازشناسی مقاوم گفتار مبتنی بر دادگان مفقود با استفاده از شبکه عصبی دوسویه

Performance of speech recognition systems is greatly reduced when speech corrupted by noise. One common method for robust speech recognition systems is missing feature methods. In this way, the components in time - frequency representation of signal (Spectrogram) that present low signal to noise ratio (SNR), are tagged as missing and deleted then replaced by remained components and statistical ...

متن کامل

Non-linear feature extraction for robust speech recognition in stationary and non-stationary noise

An analysis-based non-linear feature extraction approach is proposed, inspired by a model of how speech amplitude spectra are affected by additive noise. Acoustic features are extracted based on the noiserobust parts of speech spectra without losing discriminative information. Two non-linear processing methods, harmonic demodulation and spectral peak-to-valley ratio locking, are designed to min...

متن کامل

Fuzzy-based discriminative feature representation for children's speech recognition

Automatic recognition of the speech of children is a challenging topic in computer-based speech recognition systems. Conventional feature extraction method namely Mel-frequency cepstral coefficient (MFCC) is not efficient for children’s speech recognition. This paper proposes a novel fuzzy-based discriminative feature representation to address the recognition of Malay vowels uttered by children...

متن کامل

Improving the performance of MFCC for Persian robust speech recognition

The Mel Frequency cepstral coefficients are the most widely used feature in speech recognition but they are very sensitive to noise. In this paper to achieve a satisfactorily performance in Automatic Speech Recognition (ASR) applications we introduce a noise robust new set of MFCC vector estimated through following steps. First, spectral mean normalization is a pre-processing which applies to t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1997